CKMorph: a comprehensive morphological analyzer for Central Kurdish
Abstract
A morphological analyzer, a significant component of many natural language processing applications, especially for morphologically rich languages, divides an input word into all its composing morphemes and identifies their roles. This paper introduces a comprehensive morphological analyzer for Central Kurdish (CK), also known as Sorani, a low-resourced language with rich morphology. Building upon the limited existing literature, we first assembled and systematically categorized an extensive collection of the morphophonological rules of the language. Additionally, we collected and manually labeled a generative lexicon containing nearly 10,000 verb, noun, and adjective stems, named entities, and other types of stems. We used these rule sets and resources to implement the CKMorph Analyzer, based on finite-state transducers. In order to provide a benchmark for future research, we collected, labeled, and publicly shared test sets for evaluating the accuracy and coverage of the analyzer. CKMorph was able to correctly analyze 95.9% of the first test set, which consists of 1000 CK words analyzed according to their context. Moreover, it gave at least one analysis for 95.5% of the 4.22 M tokens in the second test set. The demonstration application and the resources, including the verb database and test sets, are openly accessible at github.com/CKMorph.
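The two evaluation measures in the abstract can be illustrated with a minimal Python sketch. It is not the CKMorph implementation: a plain table lookup stands in for the finite-state transducers, and the stems, suffixes, and tags (`STEMS`, `SUFFIXES`, `ALLOWED`) are invented Latin-script placeholders rather than Central Kurdish data. The sketch also assumes that a word counts as correctly analyzed when its gold, in-context analysis appears among the analyzer's outputs; the paper's exact scoring rule is not stated in the abstract.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy tables (placeholders, NOT CKMorph resources).
STEMS: Dict[str, str] = {"kat": "NOUN", "kate": "NOUN", "ban": "VERB"}
SUFFIXES: Dict[str, str] = {"": "", "s": "PL", "es": "PL", "ed": "PAST"}
# Suffix tags that each stem category accepts ("" means a bare stem).
ALLOWED: Dict[str, Set[str]] = {"NOUN": {"", "PL"}, "VERB": {"", "PAST"}}

Analysis = Tuple[str, ...]


def analyze(word: str) -> List[Analysis]:
    """Return every stem+suffix segmentation licensed by the toy tables."""
    analyses: List[Analysis] = []
    for cut in range(1, len(word) + 1):
        stem, suffix = word[:cut], word[cut:]
        if stem in STEMS and suffix in SUFFIXES:
            pos, tag = STEMS[stem], SUFFIXES[suffix]
            if tag in ALLOWED[pos]:
                analyses.append((stem, pos) + ((tag,) if tag else ()))
    return analyses


def coverage(tokens: List[str]) -> float:
    """Fraction of tokens that receive at least one analysis."""
    return sum(1 for t in tokens if analyze(t)) / len(tokens)


def accuracy(gold: List[Tuple[str, Analysis]]) -> float:
    """Fraction of words whose gold analysis is among the outputs (assumed rule)."""
    return sum(1 for w, a in gold if a in analyze(w)) / len(gold)


if __name__ == "__main__":
    print(analyze("kates"))  # an ambiguous word yields two segmentations
    print(coverage(["kat", "kates", "baned", "xyz"]))
```

Coverage only asks whether any analysis is produced, so it can be computed on a large unlabeled corpus, whereas accuracy needs manually labeled words; this is consistent with the two test sets differing so much in size.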
Similar resources
A Comprehensive Morphological Analyzer for Swedish
SWETWOL is implemented in the framework of Koskenniemi’s (1983) two-level model. It contains a 48,000 item lexicon and a full inflectional description. Special attention was paid to the design of a computational analysis of productive Swedish compounds. Recall (coverage) and precision of SWETWOL meet high standards. SWETWOL has been extensively tested on various types of texts.
Morphological Analyzer for Kokborok
Morphological analysis is concerned with retrieving the syntactic and morphological properties or the meaning of a morphologically complex word. Morphological analysis retrieves the grammatical features and properties of an inflected word. However, this paper introduces the design and implementation of a Morphological Analyzer for Kokborok, a resource constrained and less computerized Indian la...
A Morphological Analyzer for Filipino Verbs
This paper presents a morphological analyzer that accepts Filipino verbs conjugated in different forms as inputs and analyzes them to produce the affixes used, the infinitive forms, and the tenses of the original input verbs. A prototype system was implemented and was fed with a file containing 1,050 Filipino verbs conjugated in various tenses using different types of affixes. The preliminary r...
A Morphological Analyzer for Standard Albanian
In this paper, we present a morphological analyzer for standard Albanian intended as a component of an annotation tool in the context of the Albanian Corpus Initiative. The analyzer uses off-line components for generating sub-regular and irregular word forms based on the verb inflector described in Trommer (1997) and simple morphological rules for main inflectional patterns. Part of the analyze...
VenPro: A Morphological Analyzer for Venetan
This document reports the process of extending MorphoPro for Venetan, a lesser-used language spoken in the North-Eastern part of Italy. MorphoPro is the morphological component of TextPro, a suite of tools oriented towards a number of NLP tasks. In order to extend this component to Venetan, we developed a declarative representation of the morphological knowledge necessary to analyze and synthesi...
Journal
Journal title: International Journal of Digital Humanities
Year: 2023
ISSN: 2524-7832, 2524-7840
DOI: https://doi.org/10.1007/s42803-022-00062-7